SEQUENCE LISTING

<110> JANSEN, KATHRIN U.
      SCHULTZ, LOREN D.
      NEEPER, MICHAEL P.
      MARKUS, HENRY Z.

<120> OPTIMIZED EXPRESSION OF HPV 31 L1 IN YEAST

<130> 21188P 

<150> PCT/US2004/008677 
<151> 2004-03-19

<150> 60/457,172 
<151> 2003-03-23 

<160> 8

<170> FASTSEQ FOR WINDOWS VERSION 4.0 

<210> 1
<211> 1515
<212> DNA
<213> HPV31 L1 WILD-TYPE

<400> 1
ATGTCTCTGT GGCGGCCTAG CGAGGCTACT GTCTACTTAC CACCTGTCCC AGTGTCTAAA 60
GTTGTAAGCA CGGATGAATA TGTAACACGA ACCAACATAT ATTATCACGC AGGCAGTGCT 120
AGGCTGCTTA CAGTAGGCCA TCCATATTAT TCCATACCTA AATCTGACAA TCCTAAAAAA 180
ATAGTTGTAC CAAAGGTGTC AGGATTACAA TATAGGGTAT TTAGGGTTCG TTTACCAGAT 240
CCAAACAAAT TTGGATTTCC TGATACATCT TTTTATAATC CTGAAACTCA ACGCTTAGTT 300
TGGGCCTGTG TTGGTTTAGA GGTAGGTCGC GGGCAGCCAT TAGGTGTAGG TATTAGTGGT 360
CATCCATTAT TAAATAAATT TGATGACACT GAAAACTCTA ATAGATATGC CGGTGGTCCT 420
GGCACTGATA ATAGGGAATG TATATCAATG GATTATAAAC AAACACAACT GTGTTTACTT 480
GGTTGCAAAC CACCTATTGG AGAGCATTGG GGTAAAGGTA GTCCTTGTAG TAACAATGCT 540
ATTACCCCTG GTGATTGTCC TCCATTAGAA TTAAAAAATT CAGTTATACA AGATGGGGAT 600
ATGGTTGATA CAGGCTTTGG AGCTATGGAT TTTACTGCTT TACAAGACAC TAAAAGTAAT 660
GTTCCTTTGG ACATTTGTAA TTCTATTTGT AAATATCCAG ATTATCTTAA AATGGTTGCT 720
GAGCCATATG GCGATACATT ATTTTTTTAT TTACGTAGGG AACAAATGTT TGTAAGGCAT 780
TTTTTTAATA GATCAGGCAC GGTTGGTGAA TCGGTCCCTA CTGACTTATA TATTAAAGGC 840
TCCGGTTCAA CAGCTACTTT AGCTAACAGT ACATACTTTC CTACACCTAG CGGCTCCATG 900
GTTACTTCAG ATGCACAAAT TTTTAATAAA CCATATTGGA TGCAACGTGC TCAGGGACAC 960
AATAATGGTA TTTGTTGGGG CAATCAGTTA TTTGTTACTG TGGTAGATAC CACACGTAGT 1020
ACCAATATGT CTGTTTGTGC TGCAATTGCA AACAGTGATA CTACATTTAA AAGTAGTAAT 1080
TTTAAAGAGT ATTTAAGACA TGGTGAGGAA TTTGATTTAC AATTTATATT TCAGTTATGC 1140
AAAATAACAT TATCTGCAGA CATAATGACA TATATTCACA GTATGAATCC TGCTATTTTG 1200
GAAGATTGGA ATTTTGGATT GACCACACCT CCCTCAGGTT CTTTGGAGGA TACCTATAGG 1260
TTTGTAACCT CACAGGCCAT TACATGTCAA AAAAGTGCCC CCCAAAAGCC CAAGGAAGAT 1320
CCATTTAAAG ATTATGTATT TTGGGAGGTT AATTTAAAAG AAAAGTTTTC TGCAGATTTA 1380
GATCAGTTTC CACTGGGTCG CAAATTTTTA TTACAGGCAG GATATAGGGC ACGTCCTAAA 1440
TTTAAAGCAG GTAAACGTAG TGCACCCTCA GCATCTACCA CTACACCAGC AAAACGTAAA 1500
AAAACTAAAA AGTAA 1515



<210> 2
<211> 1515
<212> DNA
<213> ARTIFICIAL SEQUENCE

<220>
<223> 31 PARTIAL REBUILD

<400> 2






ATGTCTCTGT GGCGGCCTAG CGAGGCTACT GTCTACTTAC CACCTGTCCC AGTGTCTAAA 60
GTTGTAAGCA CGGATGAATA TGTAACACGA ACCAACATAT ATTATCACGC AGGCAGTGCT 120
AGGCTGCTTA CAGTAGGCCA TCCATATTAT TCCATACCTA AATCTGACAA TCCTAAAAAA 180
ATAGTTGTAC CAAAGGTGTC AGGATTACAA TATAGGGTAT TTAGGGTTCG TTTACCAGAT 240
CCAAACAAAT TTGGATTTCC TGATACATCT TTTTATAATC CTGAAACTCA ACGCTTAGTT 300
TGGGCCTGTG TTGGTTTAGA GGTAGGTCGC GGGCAGCCAT TAGGTGTAGG TATTAGTGGT 360
CATCCATTAT TAAATAAATT TGATGACACT GAAAACTCTA ATAGATATGC CGGTGGTCCT 420
GGCACTGATA ATAGGGAATG TATATCAATG GATTATAAAC AAACACAACT GTGTTTACTT 480
GGTTGCAAAC CACCTATTGG AGAGCATTGG GGTAAAGGTA GTCCTTGTAG TAACAATGCT 540
ATTACCCCTG GTGATTGTCC TCCATTAGAA TTAAAAAATT CAGTTATACA AGATGGGGAT 600
ATGGTTGATA CAGGCTTTGG AGCTATGGAT TTTACTGCTT TACAAGACAC TAAAAGTAAT 660
GTTCCTTTGG ACATTTGTAA TTCTATTTGT AAATATCCAG ATTATCTTAA AATGGTTGCT 720
GAGCCATACG GCGACACCTT GTTCTTCTAT TTGCGTAGAG AACAGATGTT CGTAAGGCAC 780
TTCTTCAACA GATCCGGCAC CGTAGGTGAA TCTGTCCCAA CCGACCTGTA CATCAAGGGC 840
TCCGGTTCCA CCGCTACCCT GGCTAACTCC ACCTACTTCC CAACTCCATC TGGCTCCATG 900
GTCACCTCCG ACGCTCAGAT CTTCAACAAG CCATACTGGA TGCAGCGTGC ACAGGGTCAC 960
AACAACGGTA TCTGTTGGGG TAACCAGCTG TTCGTGACTG TGGTCGATAC CACGCGTTCT 1020
ACCAACATGT CTGTCTGTGC TGCAATCGCT AACTCTGACA CTACCTTCAA GTCCTCTAAC 1080
TTCAAGGAGT ACCTGAGACA TGGTGAGGAA TTCGATCTGC AATTCATCTT CCAGTTGTGC 1140
AAGATCACCC TGTCTGCTGA CATCATGACC TACATCCACA GTATGAACCC TGCCATCCTG 1200
GAGGACTGGA ACTTCGGTCT GACCACTCCA CCTTCCGGTT CTTTGGAGGA TACCTATAGG 1260
TTTGTAACCT CACAGGCCAT TACATGTCAA AAAAGTGCCC CCCAAAAGCC CAAGGAAGAT 1320
CCATTTAAAG ATTATGTATT TTGGGAGGTT AATTTAAAAG AAAAGTTTTC TGCAGATTTA 1380
GATCAGTTTC CACTGGGTCG CAAATTTTTA TTACAGGCAG GATATAGGGC ACGTCCTAAA 1440
TTTAAAGCAG GTAAACGTAG TGCACCCTCA GCATCTACCA CTACACCAGC AAAACGTAAA 1500
AAAACTAAAA AGTAA 1515




<210> 3
<211> 1515
<212> DNA
<213> ARTIFICIAL SEQUENCE

<220>
<223> 31 TOTAL REBUILD

<400> 3
















ATGTCTTTGT GGAGACCATC TGAAGCTACC GTCTACTTGC CACCAGTCCC AGTCTCTAAG 60
GTCGTCTCTA CCGACGAATA CGTCACCAGA ACCAACATCT ACTACCACGC TGGTTCTGCT 120
AGATTGTTGA CCGTCGGTCA CCCATACTAC TCTATCCCAA AGTCTGACAA CCCAAAGAAG 180
ATCGTCGTCC CAAAGGTCTC TGGTTTGCAA TACAGAGTCT TCAGAGTCAG ATTGCCAGAC 240
CCAAACAAGT TCGGTTTCCC AGACACCTCT TTCTACAACC CAGAAACCCA AAGATTGGTC 300
TGGGCTTGTG TCGGTTTGGA AGTCGGTAGA GGTCAACCAT TGGGTGTCGG TATCTCTGGT 360
CACCCATTGT TGAACAAGTT CGACGACACC GAAAACTCTA ACAGATACGC TGGTGGTCCA 420
GGTACCGACA ACAGAGAATG TATCTCTATG GACTACAAGC AAACCCAATT GTGTTTGTTG 480
GGTTGTAAGC CACCAATCGG TGAACACTGG GGTAAGGGTT CTCCATGTTC TAACAACGCT 540
ATCACCCCAG GTGACTGTCC ACCATTGGAA TTGAAGAACT CTGTCATCCA AGACGGTGAC 600
ATGGTCGACA CCGGTTTCGG TGCTATGGAC TTCACCGCTT TGCAAGACAC CAAGTCTAAC 660
GTCCCATTGG ACATCTGTAA CTCTATCTGT AAGTACCCAG ACTACTTGAA GATGGTCGCT 720
GAACCATACG GCGACACCTT GTTCTTCTAC TTGCGTAGAG AACAGATGTT CGTAAGGCAC 780
TTCTTCAACA GATCCGGCAC CGTAGGTGAA TCTGTCCCAA CCGACCTGTA CATCAAGGGC 840
TCCGGTTCCA CCGCTACCCT GGCTAACTCC ACCTACTTCC CAACTCCATC TGGCTCCATG 900
GTCACCTCCG ACGCTCAGAT CTTCAACAAG CCATACTGGA TGCAGCGTGC ACAGGGTCAC 960
AACAACGGTA TCTGTTGGGG TAACCAGCTG TTCGTGACTG TGGTCGATAC CACGCGTTCT 1020
ACCAACATGT CTGTCTGTGC TGCAATCGCT AACTCTGACA CTACCTTCAA GTCCTCTAAC 1080
TTCAAGGAGT ACCTGAGACA TGGTGAGGAA TTCGATCTGC AATTCATCTT CCAGTTGTGC 1140
AAGATCACCC TGTCTGCTGA CATCATGACC TACATCCACA GTATGAACCC TGCCATCCTG 1200
GAGGACTGGA ACTTCGGTCT GACCACTCCA CCTTCCGGTT CTTTGGAAGA CACCTACAGA 1260
TTCGTCACCT CTCAAGCTAT CACCTGTCAA AAGTCTGCTC CACAAAAGCC AAAGGAAGAC 1320
CCATTCAAGG ACTACGTCTT CTGGGAAGTC AACTTGAAGG AAAAGTTCTC TGCTGACTTG 1380
GACCAATTCC CATTGGGTAG AAAGTTCTTG TTGCAAGCTG GTTACAGAGC TAGACCAAAG 1440
TTCAAGGCTG GTAAGAGATC TGCTCCATCT GCTTCTACCA CCACCCCAGC TAAGAGAAAG 1500
AAGACCAAGA AGTAA 1515



<210> 4
<211> 504
<212> PRT
<213> ARTIFICIAL SEQUENCE

<220>
<223> HPV 31 L1

<400> 4

MET SER LEU TRP ARG PRO SER GLU ALA THR VAL TYR LEU PRO PRO VAL
1 5 10 15

PRO VAL SER LYS VAL VAL SER THR ASP GLU TYR VAL THR ARG THR ASN
20 25 30

ILE TYR TYR HIS ALA GLY SER ALA ARG LEU LEU THR VAL GLY HIS PRO
35 40 45

TYR TYR SER ILE PRO LYS SER ASP ASN PRO LYS LYS ILE VAL VAL PRO
50 55 60

LYS VAL SER GLY LEU GLN TYR ARG VAL PHE ARG VAL ARG LEU PRO ASP
65 70 75 80

PRO ASN LYS PHE GLY PHE PRO ASP THR SER PHE TYR ASN PRO GLU THR
85 90 95

GLN ARG LEU VAL TRP ALA CYS VAL GLY LEU GLU VAL GLY ARG GLY GLN
100 105 110

PRO LEU GLY VAL GLY ILE SER GLY HIS PRO LEU LEU ASN LYS PHE ASP
115 120 125

ASP THR GLU ASN SER ASN ARG TYR ALA GLY GLY PRO GLY THR ASP ASN
130 135 140

ARG GLU CYS ILE SER MET ASP TYR LYS GLN THR GLN LEU CYS LEU LEU
145 150 155 160

GLY CYS LYS PRO PRO ILE GLY GLU HIS TRP GLY LYS GLY SER PRO CYS
165 170 175

SER ASN ASN ALA ILE THR PRO GLY ASP CYS PRO PRO LEU GLU LEU LYS
180 185 190

ASN SER VAL ILE GLN ASP GLY ASP MET VAL ASP THR GLY PHE GLY ALA
195 200 205

MET ASP PHE THR ALA LEU GLN ASP THR LYS SER ASN VAL PRO LEU ASP
210 215 220

ILE CYS ASN SER ILE CYS LYS TYR PRO ASP TYR LEU LYS MET VAL ALA
225 230 235 240

GLU PRO TYR GLY ASP THR LEU PHE PHE TYR LEU ARG ARG GLU GLN MET
245 250 255

PHE VAL ARG HIS PHE PHE ASN ARG SER GLY THR VAL GLY GLU SER VAL
260 265 270

PRO THR ASP LEU TYR ILE LYS GLY SER GLY SER THR ALA THR LEU ALA
275 280 285

ASN SER THR TYR PHE PRO THR PRO SER GLY SER MET VAL THR SER ASP
290 295 300

ALA GLN ILE PHE ASN LYS PRO TYR TRP MET GLN ARG ALA GLN GLY HIS
305 310 315 320

ASN ASN GLY ILE CYS TRP GLY ASN GLN LEU PHE VAL THR VAL VAL ASP
325 330 335

THR THR ARG SER THR ASN MET SER VAL CYS ALA ALA ILE ALA ASN SER
340 345 350

ASP THR THR PHE LYS SER SER ASN PHE LYS GLU TYR LEU ARG HIS GLY
355 360 365

GLU GLU PHE ASP LEU GLN PHE ILE PHE GLN LEU CYS LYS ILE THR LEU
370 375 380

SER ALA ASP ILE MET THR TYR ILE HIS SER MET ASN PRO ALA ILE LEU
385 390 395 400

GLU ASP TRP ASN PHE GLY LEU THR THR PRO PRO SER GLY SER LEU GLU
405 410 415

ASP THR TYR ARG PHE VAL THR SER GLN ALA ILE THR CYS GLN LYS SER
420 425 430

ALA PRO GLN LYS PRO LYS GLU ASP PRO PHE LYS ASP TYR VAL PHE TRP
435 440 445

GLU VAL ASN LEU LYS GLU LYS PHE SER ALA ASP LEU ASP GLN PHE PRO
450 455 460

LEU GLY ARG LYS PHE LEU LEU GLN ALA GLY TYR ARG ALA ARG PRO LYS
465 470 475 480

PHE LYS ALA GLY LYS ARG SER ALA PRO SER ALA SER THR THR THR PRO
485 490 495

ALA LYS ARG LYS LYS THR LYS LYS
500




<210> 5
<211> 34
<212> DNA
<213> ARTIFICIAL SEQUENCE

<220>
<223> PCR PRIMER

<400> 5
CGTCGACGTA AACGTGTATC ATATTTTTTT ACAG 34

<210> 6
<211> 25
<212> DNA
<213> ARTIFICIAL SEQUENCE

<220>
<223> PCR PRIMER

<400> 6
CAGACACATG TATTACATAC ACAAC 25

<210> 7
<211> 41
<212> DNA
<213> ARTIFICIAL SEQUENCE

<220>
<223> PCR PRIMER

<400> 7
CTCAGATCTC ACAAAACAAA ATGTCTCTGT GGCGGCCTAG C 41

<210> 8
<211> 38
<212> DNA
<213> ARTIFICIAL SEQUENCE

<220>
<223> PCR PRIMER

<400> 8
GACAGATCTT ACTTTTTAGT TTTTTTACGT TTTGCTGG 38